Clustering and latent semantic indexing aspects of the singular value decomposition
نویسندگان
چکیده
منابع مشابه
Clustering and Latent Semantic Indexing Aspects of the Singular Value Decomposition
This paper discusses clustering and latent semantic indexing (LSI) aspects of the singular value decomposition (SVD). The purpose of this paper is twofold. The first is to give an explanation on how and why the singular vectors can be used in clustering. And the second is to show that the two seemingly unrelated SVD aspects actually originate from the same source: related vertices tend to be mo...
متن کاملUpdating the partial singular value decomposition in latent semantic indexing
Latent semantic indexing (LSI) is a method of information retrieval that relies heavily on the partial singular value decomposition (PSVD) of the term-document matrix representation of a dataset. Calculating the PSVD of large term-document matrices is computationally expensive; hence in the case where terms or documents are merely added to an existing dataset, it is extremely beneficial to upda...
متن کاملTwo Uses for Updating the Partial Singular Value Decomposition in Latent Semantic Indexing
Latent Semantic Indexing (LSI) is an information retrieval (IR) method that connects IR with numerical linear algebra by representing a dataset as a term-document matrix. Because of the tremendous size of modern databases, such matrices can be very large. The partial singular value decomposition (PSVD) is a matrix factorization that captures the salient features of a matrix, while using much le...
متن کاملUsing Random Indexing to improve Singular Value Decomposition for Latent Semantic Analysis
We present results from using Random Indexing for Latent Semantic Analysis to handle Singular Value Decomposition tractability issues. We compare Latent Semantic Analysis, Random Indexing and Latent Semantic Analysis on Random Indexing reduced matrices. In this study we use a corpus comprising 1003 documents from the MEDLINE-corpus. Our results show that Latent Semantic Analysis on Random Index...
متن کاملDouble Clustering in Latent Semantic Indexing
Document clustering is a widely researched area of information retrieval. The large amount of documents which must be handled needs automatic organizing. A popular approach to clustering documents and messages is the vector space model, which represents texts with feature vectors, usually generated from the set of terms contained in the message. The clustering based on the document-term frequen...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: International Journal of Information and Decision Sciences
سال: 2016
ISSN: 1756-7017,1756-7025
DOI: 10.1504/ijids.2016.075790